In this paper we propose to use utterance-level Permutation Invariant Training (uPIT) for speaker-independent multi-talker speech separation and denoising, simultaneously. Specifically, we train deep bi-directional Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) using uPIT for single-channel speaker-independent multi-talker speech separation in multiple noisy conditions, including both synthetic and real-life noise signals. We focus our experiments on the generalizability and noise robustness of models that rely on various types of a priori knowledge, e.g., the noise type and the number of simultaneous speakers. We show that deep bi-directional LSTM RNNs trained using uPIT in noisy environments can improve the Signal-to-Distortion Ratio (SDR) as well as the Extended Short-Time Objective Intelligibility (ESTOI) measure on the speaker-independent multi-talker speech separation and denoising task, for various noise types and Signal-to-Noise Ratios (SNRs). In particular, we first show that LSTM RNNs can achieve large SDR and ESTOI improvements when evaluated using known noise types, and that a single model is capable of handling multiple noise types with only a slight decrease in performance. Furthermore, we show that a single LSTM RNN can handle both two-speaker and three-speaker noisy mixtures without a priori knowledge about the exact number of speakers. Finally, we show that LSTM RNNs trained using uPIT generalize well to noise types not seen during training.
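The core idea behind uPIT is that the speaker-to-output assignment is not known in advance, so the training loss is computed under the best permutation of network outputs and reference signals, with a single assignment held fixed over the whole utterance. A minimal sketch of that loss, assuming a magnitude-domain MSE criterion and illustrative NumPy array shapes (neither is prescribed by the abstract), could look as follows:

```python
# Hypothetical sketch of the utterance-level PIT (uPIT) loss.
# Shapes, the MSE criterion, and NumPy are illustrative assumptions,
# not details taken from the paper itself.
from itertools import permutations

import numpy as np


def upit_loss(estimates, references):
    """Utterance-level PIT loss.

    estimates, references: arrays of shape (num_speakers, num_frames, num_bins).
    Returns the MSE under the single best speaker permutation, evaluated
    over the entire utterance (not per frame).
    """
    n = estimates.shape[0]
    best = np.inf
    for perm in permutations(range(n)):
        # Average error over the whole utterance for this speaker assignment.
        err = np.mean((estimates[list(perm)] - references) ** 2)
        best = min(best, err)
    return best


# Toy usage: feeding the reference sources back in swapped speaker order
# still yields zero loss, because uPIT searches over all permutations.
refs = np.random.rand(2, 100, 257)
swapped = refs[::-1]
print(upit_loss(swapped, refs))
```

Evaluating the permutation once per utterance, rather than per frame, is what keeps each output stream assigned to one speaker throughout the signal; the search is factorial in the number of speakers, which is cheap for the two- and three-speaker mixtures considered here.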